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ABSTRACT 

I present a new algorithm, CALCLENS, for efficiently computing weak gravitational lensing 
shear signals from large N-body light cone simulations over a curved sky. This new algo- 
rithm properly accounts for the sky curvature and boundary conditions, is able to produce 
redshift-dependent shear signals including corrections to the Born approximation by using 
multiple-plane ray tracing, and properly computes the lensed images of source galaxies in 
the light cone. The key feature of this algorithm is a new, computationally efficient Poisson 
solver for the sphere that combines spherical harmonic transform and multgrid methods. As 
a result, large areas of sky 10, 000 square degrees) can be ray traced efficiently at high- 
resolution using only a few hundred cores on widely available machines. Using this new 
algorithm and curved-sky calculations that only use a slower but more accurate spherical har- 
monic transform Poisson solver, I study the shear B-mode and rotation mode power spectra. 
Employing full-sky E/B-mode decompositions, I confirm that the shear B-mode and rotation 
mode power spectra are equal at high accuracy (< 1%), as expected from perturbation theory 
up to second order Coupled with realistic galaxy populations placed in large N-body light 
cone simulations, this new algorithm is ideally suited for the construction of synthetic weak 
lensing shear catalogs to be used to test for systematic effects in data analysis procedures for 
upcoming large-area sky surveys. The implementation presented in this work, written in C and 
employing widely available software libraries to maintain portability, is publicly available at 
http : //code . google . com/p/calclens. 

Key words: gravitational lensing: weak; cosmology: theory; methods: numerical 



1 INTRODUCTION 

Weak lensing analyses are an integral part of the scientific program 
of upcoming large-area sky surveys, such as the DE^ LSS10, 
Euclicll HSCfl KlD^l and Pan-STARR^ surveys. They will 
be used to study the growth of structure via cosmic shear (e.g., 
iHoekstra & Jainll2008h and will additionally serve as key followup 
measurements for cosmological constraints derived from the abun- 
dance of galaxy clus ters as a function of mass and redshift (e.g., 
[Weinberg et"aiTl2012h . The importance of these observations has 
motivated extensive studies of all aspects of weak lensing (WL) 
measurements, modeling, and theory. This diverse set of topics 
includes the process of measuring galaxy shapes and measuring 
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^ The Dark Energy Survey - http://www.darkenergysurvey .orgl 

Large Synoptic Survey Telescope - http://www.lsst.org 
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* Hyper Suprime-Cam - http://www.naoj.org/Projects/HSC 
^ The Kilo Degree Survey - http://kids.strw.leidenuniv.nl 

^ The Panoramic Survey Telescope & Rapid Response System - http://pan- 
staiTS.ifa.hawaii.edu 



WL shear from pixelized images (e.g., STEP l/2, Hexmanset_alJ 
|2006'; 'Massev et all |2007| or GREAT08/10, ISridle et all hOld . 
Kitching et al. 20121) . the construction of estimators and meth- 
ods to infer physically interesting quantities from cata l ogs of 



galaxy shapes and positions (e.g., Schneider et al. 199^ Selj^ 



19981: IHu & Whitell200ll: Ijarvis et alj|2004l: Ijohnston et al.ll2007al:^ 



Schneider & KilbingeJ l2007l : ISchneider et al.ll2010l : iHikage et all 



(2011: Becker 2012 ), photometric r e dshift estimators and their 
use (e.g., IComster& Lahavl l2004l: iMandelbaum etalJ l2008bl: 



ICunha et al.ll2009l : lOerdes et all 1201(3 : ICunha et alj|2012h. astro 



physical contaminant s to WL signals (e.g., iHeavens et alj l200d : 



tions to the first-order WL approximation itself (e.g., Jain et al 


200C 


Coorav&Hull2002l: 


Hirata & Seliak 20031: 


Vale & White 


2003 


Dodelson et al.ll2006l: 


Shaoiro & Cooravl 2006 


: Hilbert et al 


200? 


Krause & Hiratdl201C 


), and the basic matter and halo statis- 



Rudd et all 20081: 


Havashi & Whitd 20081: Hilbert & White. 


201c 


Guillet et al. 20 IC 
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^awre 


nee et alj 20 id: 


van Daalen et al. 


2011 


Casarinietal .11201 ll. 


201 2I: 


Takahashi et alj 


201 2|). 



In particular, cosmological constraints from WL analyses de- 
pend sensitively on the predictions of non-linear structure forma- 
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tion (e.g.. iHuterer & Takadal llOO^ . Thus both cosmological N- 
body (or N-body plus gas dynamic) and ray tracing simulations , 
which compute the shear field directly (e.g., 



Barber et al. 



Jain et alj|2000l: IVale & Whitel2003l ; iHilbert et alJl2009l : ISato et'al] 



199^: 



2009I) . must be used to o btain even basic predictions for the statis 



tics of WL signals (e.g ., ISemboloni et^bOOTl : ISato et alj[2009l 
l201lHKavoetalJl2012l) . While these different types of simulations 
are typically used to study how non-linear structure formation im- 
pacts the analysis of WL signals, they can also be used as the basis 
for synthetic WL galaxy shear catalogs for studying potential sys- 
tematic effects in the process of making and interpreting WL mea- 
surements themselve s (e.g.. IVale et al.lHQ04l : iHartlap et alj|201 ll : 
iHevmans et"ai]|2012h . In the coming years, as future surveys be- 
come larger and more complicated with ever increasing statistical 
power, the use of simulations in this way will become increasingly 
important for understanding WL measurements and ensuring that 
the cosmological parameter constraints derived from them are free 
of systematic errors. 

From the point of view of WL alone, there are several require- 
ments that these synthetic WL galaxy shear catalogs must meetQ 
Requirements directly focused on WL include computing redshift- 
dependent shear signals at the locations of the WL sources, prop- 
erly capturing the effects of magnification on WL source sizes 
and magnitudes, and computing the image positions of sources. 
Capturing magnification effects properly is important for testing 
methods to self-calibrate u ncertainties in shear measurements using 
magn ification signals (e.g.. lRozo & Schmidtll201(3l : IVallinotto et alj 
l201lh and direct measurements of magnific ation signals or cross- 
correlations of them with other signals (e.g., Yang & Zhang 20111; 



Heavens & Joac himi 201 1 ; Gaztanaga et al. 2012; Casaponsa et alj 
2012h . Also, these WL statistics should be computed for light cone 



simulations which model the complete past light cone of a fiducial 
observer[f| Additionally, any algorithm to compute the WL signals 
given an observer's past light cone should properly resolve the rele- 
vant physical (i.e., a few resolution lengths of the light cone simula- 
tion) and angular scales (i.e., a few to tens of arcseconds for captur- 
ing the properties of galaxy image positions). Finally, for large-area 
sky surveys like those in the near future, it will be advantageous to 
produce shear signals for the entire sky at once, accounting prop- 
erly for the sky curvature. 

This last point deserves special attention in particular. It is 
certainly possible to proceed by making a large number of small- 
area (~ 10^ square degree) WL simulations. This approach has 
several advantages. For a fixed dynamic range in the underlying 
N-body simulation, smaller area WL simulations will have bet- 
ter resolved structures. Also, as I will show below, the compu- 
tational methods needed for producing these simulations with a 
curved sky are difficult to simultaneously make fast and accurate. 
However, for certain scientific analyses, such as cross-correlation 
weak le nsing measurements with optically sel ected galaxy clus- 
ters (e.g.. |johnston et alj2007bl ; lMandelbaum etal. 2008a bj), usmg 
curved-sky ps eudo-CL techniques to measure cosmic shear power 
spectra (e.g., iHikage et al.l 1201 ih . or Cosmic Microwave Back- 
ground ( CMB) lensing cro s s-correlations with large scale struc- 
ture (e.g.. ISmith eTai]|2007l ; iHirata et alj|2008l ; iBleem et al.ll2012l ; 



^ There are of course other requirements, like realistic galaxy populations 
and observational effects like the confusion between stars and galaxies, but 
these are not the focus of this work. 

* These light cone simulations are typically built from standard structure 
formatio n simulations either on-the-fly or as a post-processing step. See for 
example lEvrard et al.l )2002l) . 



ISherwin et al.1120121 ; iFeng et al]|2012h . large-area WL simulations 
on a curved-sky are preferred. At their typical densities for a survey 
like the DES, galaxy clusters effectively fill the sky in degree-sized 
patches, so that simultaneously capturing both the large- and small- 
angle shear signals requires high resolution over large areas with 
the proper treatment of the sky-curvature and boundary conditions. 
Note that while angular scales below ~ 1 arcmin may not be used 
for cosmic shear because of theoretical uncert ainties in the mat - 
ter power spectrum from galaxy formation (e.g., iRudd et al.ll2008h . 
stacked weak lensing shear measurements around galaxy clusters 
at moderate redshi fts will probe these small angular scales (e.g., 
iGeorge et alj|2012h . motivating higher resolution calculations than 
those needed for cosmic shear. For optically selected galaxy clus- 
ters, the shear signals on these small scales can be biased low be- 
cause of offsets between the point about which the shear is mea- 
sured and_Jhe_truecento or density peak of the cluster 
(e.g.. Ijohnston et al]l2007bl) . exactly an effect one would like to 
model. Additionally, only simulations which cover the entire sur- 
vey area can address potential systematics on the largest scales. 
Finally, as I demonstrate below, curved-sky WL simulations can be 
used to study higher-order WL effects directly using full-sky spher- 
ical harmonic E/B-mode decompositions, which are complete in 
the te chnical sense and free of ambiguous modes (e.g.. lBunn et alj 
l2003h FI 

The method of choice for WL simul ations in thi s con- 
text is multip l e-plane ray tracing (e.g.. Barber et alj 



Hilbert et alj 



Jain et alj l2()00l; I Vale & w"hit3 l2003l ; |f orero-Romero et alj 



I99S 



2007 



200^ . Multiple-plane ray tracing simulations prop- 



erly co mpute redshift depend ent shear effects, second-order lensing 
effects jPodelson et al.l2005ll . all magnification effects, and can be 
used in conjunction w ith other methods, like grid search methods 
( ISchneideretai]|l992h . to compute the prope rly lensed images of 
sources in the light cone llHilbertetal.ll2009h . To date, multiple- 
plane ray tracing simulations either have been implemented in the 
flat-sky approximation for small areas (e.g.. lBarber etalJll999l; 



Ijain et alj2OO0l ; IVale & Whitel2003l ; rHilbert et alj2009h or account 
for the sky cu rvature properly but do not track the shear at each ray 
location (e.g., iTevssier et alj2009l) . WL simulations which employ 
the Born approxim ation in either the flat-sky or curved-sky co ntext 
exist as wefl (e.g.. iFosalba et al.ll20oi ; iKiessling et al.ll201 ih . It is 
also important to note that methods used for CMB lensing simula- 
tions operate on a curved-sky and are closely related or in fact iden- 
tical to those used for simulations of WL observations with galax- 
ies, except that they t ypically employ either a single lens pla ne (e.g., 
ICarbone et all2008l ; lDas & Bodell2008l ; ISehgal et allbOlOh or syn- 
thesize the W L deflection field assuming Gaussian statistics (e.g., 
lLewisll2005[) . 

In this work, I will present CALCLENS (Curved-sky 
grAvitational Lensing for Cosmological Light conE simulatioNS), 
a curved-sky, multiple-plane ray tracing algorithm which computes 
the entire lensing Jacobian at the ray positions, suitable for the con- 
struction of synthetic WL galaxy shear catalogs for large-area sky 
surveys. In Section|2l I describe the multiple-plane ray tracing algo- 
rithm and how it can be extended to the sphere. Additionally, in this 
section I cover the technical details of the physical approximations 
made when performing multiple-plane ray tracing on the sphere. 



For shear or polarization fields observed only over a finite patch of sky, 
there exist Fourier modes in the patch which cannot be uniquely classified 
as E- or B-modes. These modes, called ambiguous modes in the literature, 
ai'e not present when the shear or polarization field covers the entire sky. 
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showing that this algorithm, although formulated on the sphere, 
still works in the Limber approximation. I also describe the method 
to find the images of lensed source galaxies. I described the code 
implementation and parallelization in Section[3] In particular, I de- 
scribe a new method to solve the Poisson equation on the surface 
of the sphere, the most time consuming step of the multiple-plane 
ray tracing algorithm, which combines spherical harmonic trans- 
form (SHT) and multigrid (MG) methods. In the high-resolution 
limit this SHT plus MG (SHT-l-MG) method is significantly faster 
than pure SHT calculations. Note that these calculations are very 
sensitive to numerical errors which generate unwanted B -modes in 
the shear field. While the combined SHT-l-MG method presented in 
this work is not perfect in this regard, the numerical artifacts are 
well below the typical level of shape noise expected for upcom- 
ing surveys. Thus in the context of making synthetic galaxy shear 
catalogs, the SHT-l-MG method presented in this work is adequate 
and computationally efficient. Additionally, it can be used to com- 
pute the shear field over only part of the sky even more quickly. 
In Section|4l I present basic tests of the accuracy and performance 
of the new multiple-plane ray tracing code presented in this work. 
Then in Section |5] I study the power spectra of the convergence, 
E-mode shear, B-mode shear and rotation modes computed from 
the weak lensing maps constructed in Sections I2I4I verifying the 
relationship between the shear B-mode power spectrum and the 
rotation mode power s pectrum expected from perturbation theory 
( iHirata & Selia3l2003h . Finally, I conclude in Section[6] 



2 WEAK LENSING & MULTIPLE-PLANE RAY 
TRACING 

In this section I will describe the basic formulation of a multiple- 
plane ray tracing algorithm and how i t can be extended from work- 
ing in the flat-sky approxim a tion (e.g.. Barber et al .Il999l : ljainet"ai] 
l200d : IVale & Whit 3 12003| ; iHilbert et al.l l2009h to working with 
large areas on the sphere where the s ky curvature matters (e.g., 
iDas & Bodjl2008l : lTevssier et alj|2009h . I will review the relevant 
equations and focus specifically on the approximations made to ar- 
rive at them , as opposed to deriving them in thorough detail (see 
for example Ijain et al.]|2000l for a complete treatment). The essen- 
tial physical point is that this algorithm, even when formulated on 
the sphere, still w ork s in the Limbe r approximation ( lLimbeilll953l : 
see lKaiseilll992l and lKaise3ll998l for applications to weak lens- 
ing) so that the line-of-sight modes are integrated over. Thus only 
modes transverse to the line-of-sight which are much shorter than 
the typical line-of-sight integration length will be accurately com- 
puted. Only an algorithm which resolves modes parallel and trans- 
verse to the line-of-sight simultaneously will accurately reproduce 
the largest scale modes in the shear field. Finally, after reviewing 
the basic formulation of the multiple-plane ray tracing algorithm, I 
will describe the algorithm used to find the images of lensed source 
galaxies in the light cone. The details of the implementation and 
parallelization of these algorithms are discussed in Section[3] 



2.1 Multiple-plane Ray Tracing 

My description of the weak lensing equations and the multiple- 
plane ray tr acing algorithm follo ws very closely that of Jainetal] 
hoOOl) an d Hilbert et al. (2009) (see also e.g., iDodelsonI I2OO3F 
IVale & WhitQi2003t ; iDas & Bodai2008l) . The weak lensing equa- 
tions can be written in the small-angle limit as 

n(x) - nx - 0) 



(1) 



/3(x = 0) is the angular position of the light ray at the observer, 
$ is the gravitational potential, and x is the comoving location of 
the light ray. The quantity r{x) is the comoving angular diameter 
distance defined as 

{ 7^-^/2 sin{K^/'^x) for Tt" > , 

X for 7^ = , (2) 

(-A')^/^ sinh((-A)~^/2x) for K <Q , 

where K is the spatial curvature, K — —Q,kHq/(?, Hq is present 
day Hubble constant, and is the present day value of the spatial 
curvature. The comoving distance to the epoch with scale factor a 
is given by 



da' 



a"H{a') 



(3) 



where H{a) is the Hubble parameter. For a flat ACDM cosmology 
H{a) — //oV^^mCi"'^ + Side with Side = 1 — ^m - Here Qrn is the 
present epoch value of the matter density in units of the critical den- 
sity and fide is a similar parameter for the dark energy. For models 
with time-varying dark energy, spatial curvature, o r radiation, the 
expression for H{a) is more complicated (see e.g.. lAlbrecht et all 

Note that the integral in Equation ([TJ is evaluated along the 
light ray's trajectory and thus is an implicit equation for /3(x). 
The gradient in this equation is defined in the small-angle limit 
as V IS = {d/dfii, 8/8/32) where {Pi, 132) describe a coordinate 
system orthogonal to the light rays trajectory. Formally, the gra- 
dient in the previous equation should be evaluated transverse to 
the light ray's current direction of travel. However, numerical tests 
have shown that using V g instead causes a negligible amount of 
error jVale & Whitel2003h . 

Equations describing the evolution of the lensing Jacobian A 
(also called the inverse magnification matrix) along the light ray's 
trajectory can be derived by simply taking another set of derivatives 
of Equation QTl with respect to the location of the r ay at the observer 
/?(X = 0) (cf ]jain et al.ll200(]l : lHilbert et al.l2009l) : 



3 / '^x 



,r(x-x') d^HP{x'),x') 



r(x)r(x') 8p,{x')dPk{x 



T,A,AX) 



(4) 



Repeated indices are summed over here and below. The inverse 
magnification matrix A is typically decomposed into four parts de- 
scribing the transformations applied locall y to the light rays as th ey 
propagate through the matter distribution jSchneider et al.|[T992h 



A 



1 — K — 71 —72 + 
— 72 — U) 1 — K + 71 



(5) 



where I have assum ed that the rotation angle uj is small, as in 
IVale & White! l l2003l) . The parameter k. is the convergence, 7 = 
71 + 172 defines the complex shear and describes the shearing of 
the image, and w defines an overall rotation of the lensed image. 

In general, the shear field can be split into two types of modes, 
E- and B-modes, w hich differ in thei r transformation under par- 
ity (see, for example. iBunn et aDl2003l for a complete description). 
At first order in the gravitational potential "I>, where the equations 
above are evaluated along the unperturbed light ray trajectory, weak 
lensing by large scale structure only generates E-modes. This limit 
is known in the literatu re as the Bom approximation. As studied in 
several previous works jCoorav & Hij2002l : lHirata & Seliakl2o"o3l : 
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iKrause&Hirat^llOlOl) . deviations from the Born approximation 
generate B -modes in the shear field. Additionally, it can be shown 
that at second order in "l? on small scales t he power spectra of th e 
rotation and the B-modes should be equal jHirata & Seii^l2003[) . 
As I will show below, numerically verifying this relation is quite 
difficult but the simulations support this claim. Note that at small 
angular scales all lensing effects at second order in t he gravitational 
potential are captured by ray tracing simulations jPodelson et alj 
I2OO5I : iBemardeau et alJl2oToh . Finally, the coupling of the partial 
derivatives of $ and the inverse magnification matrix A in Equa- 
tion Q is known as the lens-lens coupling. This coupling, along 
with the deviations from the Born approximation itself, comprise 
all of the second order terms in the lensing equations at small 
scales. 

In order to derive a multiple-plane lensing algorith m, one di- 
vides the matter along the line of sight into s labs ('e.g.. |jain et alj 
I2OOOI : IVale & Whit3l2003l : iHilbert et alj|2009h or in the case of a 
spher ical geometry, shells (e.g.. Das & Bodd 20081 : iTevssier et alj 
I2OO9I) . To do this, let Xm denote the comoving distance to the mid- 
dle of the m-th lens plane of which there are total lens planes. 
Additionally, let the thickness of each lens plane be Ax so that the 
mth lens plane subtends in comoving distance from the observer 
the range 

(Xm - Ax/2, Xm + Ax/2) = (Xm-1/2, X»n + l/2) ■ 

Then the lensing equations can be written approximately as 



(JV) 



(0) 



JV-1 

E rjXN - Xm) (m) 
r{XN) " 



m=0 
AT-l 



A 



(N) 



E 



r{Xf 



r(XN 



Xm) jrr(m) <{m) 
'-'ik ^kj 



(6) 



(7) 



where af"^ = dp^ip^'^^ and (7^"' = dfi,0^tp'-"'\ The quantity 
■^t™) will be called the lensing potential and is defined as 



(m) 



'1/2 , 2 

dx . 

Xm-1/2 



(8) 



This definition differs from that used bv lVale& Whit3 ( l2003h due 
to how factors of the comoving angular diameter distance are fac- 
tored out of Equat ions ([TJ and but is consistent with that in 
lHilbertetalJ ( l2009h . 

Numerically ■0''"' can be computed in the multiple-plane 
lensing and Limber approximations from a two-dimensional Pois- 
son equation 



(m) 



= r{Xm) 



Xm-1/2 



3Hg rjxm) 
c2 a(xm) 



Xm + 1/2 



dx'S: 



Jm) 



(9) 



where 5 is the matter overdensity in the light cone and the sym- 
bol Vx is the transverse Laplacian with respect to comoving co- 
ordinates. This definition of the convergence differs by a factor of 
two relative to the standard definition (Kstandard = /2), but 
is more c o nvenie nt for numerical work. As explained in detail in 
Ijain et alj ilOOd) . when the range of integration is the entire line 
of sight one can derive this last equation by integrating by parts, 
using the Poisson equation to relate the integral of the gravitational 
potential $ along the line of sight to the matter over density, and 
neglecting long wavelength fluctuations along the line - of-sig ht via 
the Limber approximation. As argued bv lDas & Bodd j2008h . this 



approximation is f ormally quite p oor for very thin lens planes as 
written above. Also lLi et aljj201ll) note that the above equation ne- 
glects extra terms that are non-zero when one is only considering a 
finite width lens plane, although for light cones constructed so that 
the matter is continuous along the line of sight these terms cancel 
when summed over the lens planes. However, notice that the quan- 
tities of interest, namely the locations and inverse magnification 
matrices at the final positions of the rays in Equations ^ and Q, 
are computed as sums over all of the previous lens planes. Thus, to 
the extent that the corrections to the Born approximation are small, 
so that the partial derivatives transverse to the line of sight com- 
mute with the integral along the line of sight, the cancellation of 
the line-of-sight modes as required by the Limber approximation 
still occurs. Finally, below I use light cones constructed continu- 
ously on the fly as the N-bo dy simu l ation is running in order to 

avoid the effects discussed in lLi et alj ( 1201 ih . 

Following pre vious authors (e.g., IPas & Bodd l2008l : 
ITevssier et all 1 20091) . this formalism can be extended to the 
sphere by promoting all of the transverse derivatives in the above 
equations to gradients and/or Laplacian operators on the sphere. It 
is important to note that although the multiple-plane ray tracing 
algorithm can be formulated on the sphere in this way, the algo- 
rithm still will not resolve the largest scale fluctuations transverse 
to the line-of-sight properly. The problem lies in Equation ^ as 
described above. A multiple-plane ray tracing algorithm neglects 
long-wavelength fluctuations along the line-of-sight and so cannot 
self-consistently resolve those same fluctuations transverse to the 
line-of-sight. By formulating the algorithm on the sphere in this 
way, one is working locally in the small angle approximation 
at every point on the sphere and has used the gradients and/or 
Laplacians on the sphere to account for boundary condi tions and 
the r otation of the basis vectors from point to point jStebbinj 
For a survey like the DES, this approximation is expecte d 
to fail at very large angles, say £ < 20 l lLoverde & Afshordill2008h , 
and at very low redshifts where the line-of-sight integration length 
is quite short. 



2.2 Light Ray Propagation 

Several difficulties arise when adapting Equations ^ and (Q to the 
sphere. Equation ^ is only correct in the small-angle approxima- 
tion. Additionally, Equation Q requires that at each step of the ray 
tracing algorithm one have the quantity from all previous lens 
planes available. This requirement is prohibitively expensive for the 
large-area calculations presented in this work. There are however 
more efficient methods w hich require only the A^^-* from the previ - 
ous two lens planes (e.g.. lVale & Whitdl2003l : IHilbert et al]|2009h . 
In ord er to address t hese i ssues , I use a comb i nation of the meth- 
ods of iHilbert et"aD l l2009h and ITevssier et al.1 j2009h to propagate 
the rays from lens plane to lens plane. The results are reproduced 
here for completeness. Additionally in Appendix [Cl I demonstrate 
that the method I use for the lensing Jaco bian, although form ally 
derived in the flat-sky approximation by IHilbert et alJ ( l2009h . is 
equivalent to the relations given in Equation ([7) and thus appro- 
priate for the full-sky calculations presented here. 

I use the method presented in Appendix A of ITevssier et alj 

( l2009l) to propagate the ray positions from plane to plane and to 
update their propagation directions. In this method, the deflection 
angle at lens plane m, a'™' = Vi/'''"-', and the ray propagation 
direction before lensing /S'™^^' are used to compute the new ray 
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propagation direction /j'"*' through the rotation 
/3("'^ = 7^(n' 
/3' 



(m) , , (m) 



(10) 



(0) 



,(0) 



where 7?.(n, 6) is a rotation matrix which rotates vectors by an an- 
gle Q counter-clockwise about the axis n and n'™' is a unit vector 
in radial direction at the current location of the ray on the sphere 
at lens plane m. Using the updated propagation direction /3'™\ the 
ray's new Cartesian position x''""^^^ is given by 



□ (m) 



+ 2A(x('"' . /3("" ) + - X™+i = 0, A > (11) 



,(0) 



(0) 



This method is a purely three-dimensional, local version of Equa- 
tion l[6} and is consistent with how lensing of Cosmic M icrowave 
Background temperature fields is done (e.g.. lLewisl2005l) . 

The lensing Ja cobian is propagate d from plane to plane via 
(cf. Equation (16) of iHilbert et alj|2009h 



A 



(n+l) 



= 1- 



'■(Xn+i) r(xn-X"-i) / 

"'■{X-n) ?-(Xn + l - Xn-l) . (n) 



^•(Xn + l) ''(Xn-Xn-l) 
'^(Xn + 1 



Xn) rr(n) An) 



(12) 



A 



(0) 



— (5i. 



I account for the change in the local tensor basis as the rays move 
from point to point on there sphere by parallel transporting the 
inverse magnification matrix (a tensor on the sphere) along the 
geodesic connecting connecting the old ray position with the new 
ray position. This procedure is exactly that used for lensing polar- 
ization fields l lChallinor & Chonll2002h but has been adapted here 
for the case of general tensors on the surface of the sphere, as de- 
tailed in Appendix IB] 



above would assign to each galaxy the shear and image location 
at the middle of each lens plane. Although the correction is small 
in practice, it is just as easy to propagate the rays used for find- 
ing the galaxy image positions and inverse magnification matrices 
from the middle of the lens plane to the galaxy's exact comoving 
distance, as described in Appendix |D] When this step is done the 
derivatives of the lensing potential are set to zero. Note that in 
Equation i ll It one now no longer requires A > and uses the 
solution for the new ray position which minimizes the comoving 
separation between the old and new ray positions. The radial inter- 
polation defined here does not violate the condition that a galaxy 
should only be lensed by matter in front of it. Additionally it com- 
putes the shear and location of the galaxy image using the proper 
lensing geometry. Using the shear at the ray locations without the 
radial interpolation is equivalent to computing the lensing Jacobian 
for the galaxy as (cf. Equation |4j 



A-ijiXn) — Sij 



, , r(xn-x') dMP{x:),x!) , 

^ r(x„)r(x')9A(x')aA(x') "'^^ ^ 



(13) 



where Xn is the comoving distance to the middle of the lens plane 
which contains the galaxy (i.e. Xn-i/2 < Xg =5 Xn+i/2 with 
Xg is the comoving distance to the galaxy). Notice that the lensing 
kernel itself is wrong in the absence of the radial interpolation. By 
using the radial interpolation, I compute 



,. r{xa~x') a^'^W),x') , 
^ r(xG)r(x')aA(x')9/3fe(x') '^^^ 



(14) 



where I assume ^(/^(x')! x') = for x' ^ Xn-i/2- Thus the 
proper lensing geometry given a galaxy's location and the matter 
that lenses it (i.e. all matter within a distance Xn-i/2 of the ob- 
server) is enforced. This radial interpolation is optimal at lowest 
order in the lens plane width in the sense that it respects the basic 
geometric and physical properties of weak lensing. 



2.3 Finding Galaxy Images 

Using the formalism above, a set of light rays can be propagated 
from the observer back to any source redshift desired. However, 
what is needed is the image location and shear for the galax- 
ies' source locations, not t he source locations of the rays. I use 
a grid search ( c f., jj3 .4 of iHilbert et alj EoOSl) . as described by 
[Schneider et al.l ( 1 19921) . to obtain image locations and shear val- 
ues for the galaxies. In the image plane a set of triangles is built 
out of the initial ray positions. For a square grid of ray images, this 
construction is quite easy. The method used to define the triangles 
for our grids is described below in ij3] The ray tracing defines a 
mapping between a triangle on the image plane and the equivalent 
triangle composed out of the same three rays on each source plane. 
Every galaxy resides on a single source plane determined by its 
comoving distance from the observer. For each galaxy on a given 
source plane, every triangle on that source plane is checked to see if 
it contains the galaxy. If it does, the image position of the galaxy is 
computed using an interpolation over the values at the vertexes of 
the triangle. A separate interpolation is used to compute the inverse 
magnification matrix at the galaxy position as described below. 

While the interpolation over the triangle vertices defines an 
interpolation in the angular direction, the grid search as described 



3 CODE IMPLEMENTATION AND PARALLELIZATION 

In the previous section I have described the equations and algo- 
rithm used for the full-sky weak lensing simulations presented in 
this work. The basic steps of the algorithm are 

(i) Initialize the rays at the observer using Equations l llOt - l ll2t . 

(ii) Solve Equation l|9j, a two-dimensional Poisson equation on 
the sphere, for the lensing potential -i/;''"^ on the mth lens plane. 

(iii) Compute the derivatives of i/j'™' , a^™' and Ul™'^ , and then 
propagate the rays to the m + 1th lens plane using Equations jlOt 
-(III. 

(iv) For galaxies within the m + 1th lens plane, find their im- 
ages and interpolate the lensing Jacobian onto them as described in 
Sectionl23l 

(v) Repeat this process until the last lens plane is traversed. 

In this section I will describe how each of these steps has been 
implemented in CALCLENS and how various computational is- 
sues have been addressed. For example, the most time consuming 
and difficult of these steps is solving the two-dimensional Pois- 
son equation on the sphere for the lensing potential i/)'-'"^ The so- 
lution to this issue presented in this work, the SHT-l-MG method. 
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can speed these calculations up by an order of magnitude or more, 
at the cost of introducing extra B -modes in the shear field. Suf- 
ficiently high resolution calculations require either shared or dis- 
tributed memory systems and efficient parallelization in order to 
be practical. The implementation presented in this work is written 
in C and parallelized with MP1^°I in order to maintain portability 
and good performance across a variety o f systems. I also m ake ex- 
tensive use of the HEALPi]^!! libraries dOorski et a"ill2005h for all 
steps of the computation. The rest of this section is organized as 
follows. I cover the domain decomposition and layout of the rays 
in Section lSTl the SHT-l-MG method for solving for the lensing po- 
tential in Section [T2I the various interpolations needed to find the 
lensed galaxy images and lensing Jacobians in Section [33] and the 
spatially indexed light cone formats for quick access to data on disk 
in Section [J!4l In Section [J!4l I also discuss how to work with light 
cones which do not cover the entire sphere. 



3.1 Ray Layout & Domain Decomposition 

The rays in the calculation are placed on the z = Q plane at the ob- 
server on HEALPix cell centers. In order to efficiently parallelize 
the computation, I then organize all of the rays into bundles using 
lower resolution HEALPix pixels, which will be denoted as bun- 
dle cells. Using the nested indexing of the HEALPix cells, one can 
compute which bundle cell each ray belongs to with quick bit shift 
operations. The bundle cells with their respective rays are then dis- 
tributed across processors using Peano-Hilbert indexing available 
in the HEALPix package. (See Figure [T] for an example domain 
decomposition.) Gravitational lensing deflections are quite weak, 
so that each ray moves at most a few arcmin from its original posi- 
tion when propagated out to redshifts of a few. Thus I can use small 
buffer regions of rays and N-body particles when needed so that the 
rays can remain attached to the same bundle cell in which they start 
in order to simplify the structure of the code. Most calculations typ- 
ically employ bundles of rays with HEALPix resolution parameter 
NSIDE equal to 64 or 128, producing 49,152 to 196,608 bundles 
covering the full sphere. Each bundle cell is approximately 0.5 to 
1 degree across in size. Operations like propagating the rays from 
lens plane to lens plane and finding the galaxy images are com- 
pletely spatially local and thus embarrassingly parallel over this 
domain decomposition. As I will discuss below, the most time con- 
suming operation in the Poisson solver used in this work is also 
spatially local and thus embarrassingly parallel over this domain 
decomposition as well. 



3.2 Solving for tlie Lensing Potential 

The Poisson equation is inherently non-local and so can be par- 
ticularly difficult to solve in large parallel computations. The 
approach taken in this work is conceptually similar to that 



(e.g., Hocknev 
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equation globally at low resolution. Then each MPI task uses this 
global solution to the Poisson equation in conjunction with some 
other method, a MG solver in this work, to compute a higher res- 
olution solution to the Poisson equation locally. This last step is 
typically highly spatially local, so that it can be efficiently paral- 
lelized with little communication between the various MPI tasks. 

The SHT can be used to solve the Poisson equation on the 
sphere by first decomposing the source term k, into its spherical 
harmonic coefficients kim where I and m index the spherical har- 
monics Yim{0,(j))- This operation is typically called the analysis 
operation. Then the solution to the Poisson equation is computed 
in harmonic space via 



(15) 



weak lensing). Namely, the solution of the Poisson equation is 
split into two steps. The first step, implemented in this work with 
SHTs, involves tightly coupled, communication heavy computa- 
tions amongst the various MPI tasks in order to solve the Poisson 



where iptrn are the spherical harmonic coefficients of the lensing 
potential -0. The I = Q terms are set to zero by hand in this equa- 
tion. Finally, the lensing potential ip is reconstituted in real space 
by inverting the analysis operation above. This last step is typically 
called a synthesis operation. The advantage of using the SHT anal- 
ysis and synthesis operations to solve the Poisson equation is that 
they automatically enforce the proper boundary conditions on the 
sphere. However, the SHT synthesis and analysis operations have 
running times which scale as 0{N'^^'^) with the number of resolu- 
tion elements and are thus unsuitable for high-resolution calcu- 
lations. The SHT analysis and synthesis operations in this work are 
implemen ted over HEALPix using fast SHT algorithms described 
in §6.7 of IPress et alj l l2007h . In this work they have been paral- 
lelized with MPI in a manner similar to the S 2 HAlF^ package. The 
needed Fast Fourier Transforms are computed with the FFTvQ 
package jFrigo & Johnsonll2005h . Note that the implementation of 
these operations is completely parallel in memory usage as well. 

Second, in order to improve on the 0{N^''^) scaling of the 
SHT synthesis and analysis operations at high-resolution, I use the 
lensing potential from the SHT step to supply boundary condi- 
tions to a high-resolution Poisson solver which employs a finite- 
difference approximation to the Laplaci an on the sphere and re- 
laxation methods with MG acceleration iFedo renko l ll96ll : lBrandl 
Il973i) . I use t he full approx imation scheme (FAS) to imp lement the 
MG method ( lBrandllll97i see also lHahn & Abelll20I ll for a clear 
explanation of the method). The FAS scheme is implemented in 
this work as follows. First for a given bundle of rays, a new coordi- 
nate system is constructed at the center of the bundle with it located 
at (6, (j>) = (7r/2, 0) in the new coordinate system. Here (6, <j)) are 
the angular coordinates on the sphere. The angular size of the MG 
patches themselves is set to four times the size of the ray bundles, 
so that they are typically ~ 2 — 4 degrees across. This coordinate 
system employs lines of constant 9 and constant (f) to define cells 
on the sphere with equal spacing so that A9 = A<j}. Near the lo- 
cation (7r/2, 0) on the sphere, these cells have approximately the 
same area and aspect ratios near unity. The number of cells used at 
the finest level of the MG patch is set so that the particle smoothing 
kernel, described below, is covered by ~ 4 — 5 pixels and there are 
at least 256 pixels on a side in the patch at the finest level. 

Next, using this local pixelization and the boundary condi- 
tions from the SHT solution to the Poisson equation, I use red-black 
Gauss-Seidel relaxation with MG acceleration via the FAS scheme 
to solve for the lensing potential. The boundary conditions are in- 
terpolated from the HEALPix grid using the linear interpolation 



http : / /www . mpi - forum . org| 
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Figure 1. An example domain decomposition and set of spherical triangles used for the grid search for galaxy images. The left panel shows the domain 
decomposition for 16 MPI tasks and ray bundles with NSIDE = 32. Each color denotes a different domain. The right panel shows an example set of triangles 
constructed out the HEALPix cell centers for NSIDE = 8. The cell centers are shown as the blue points, the cell edges as the solid black lines, and the triangle 
sides are shown as the red dashed lines. Note that the triangle edges are geodesic curves whereas the HEALPix cell boundaries are not. 



routine available in the public HEALPix package. The discretiza- 
ti on, relaxation, restriction, and prolongation methods follow that 
of iBarrod l fl993) . except that I use direct injection instead of lin- 
ear interpolation for prolongation near the boundaries to maintain 
stability and the averaging for the restriction operation is strictly 
conservative in this implementation. Note that the rays for each 
bundle cell are located in the center of the MG patches on average. 
Thus by moving the boundary of the MG patch, where the trun- 
cation errors are the largest, away from the rays, one can limit the 
effect of the truncation errors. In the center of the MG patches, the 
MG solver can in principle converg e to ma chine precision. In prac- 
tice, I use the formalism in §20.6 of lPress et al. ( 2007) to make sure 
the MG solver is only run until the residuals on the finest grid are 
slightly less than the truncation error introduced by the boundary 
condition interpolation scheme. The convergence parameter, called 
e in this work, is defined specifically as the ratio of the L2-norm of 
the residuals to the L2-norm of the truncation errors. Typical one 
uses e ~ 0.1. The proper tuning of this convergence criterion can 
significantly speed up the calculations. Finally, I use the SHT solu- 
tion of the Poisson equation to supply the starting guess for the MG 
code. This optimization significantly speeds up the convergence of 
the MG routine. The accuracy and performance of the combined 
SHT-l-MG Poisson solver is discussed below in Section |4l In par- 
ticular, the choice of the resolution of the HEALPix solver relative 

to the MG patch solver is explored i n detail. 

I use an Epanechnikov kernel l lEpanechnikovl [T969I) normal- 
ized to unity on the sphere in order to adaptively smooth the mass 
density field sampled by the partic les of the N-body s imulation on 
each lens plane. This kernel is (cf . , iHilbert et al.ll2009h 



K{e- a) = 
N{a) = 



1 



1 - 



27r(sinc^((T/27r) — 2smc{a /-n) + 1) 



(16) 



where sinc(2;) = sin(7ra::) /-kx. The Epanechnikov kernel was cho- 
sen because it is compact and computationally efficient to imple- 
ment. The smoothing length for each N-body particle is set to the 
larger of either a few N-body softening lengths or a few inter-ray 
spacings. The constraint that the smoothing length be larger than 



a few inter-ray spacings serves to limit the aliasing of small-scale 
power from the N-body particles to the HEALPix ray grids. The 
exact values used are somewhat arbitrary and one typically em- 
ploys ~ 2 — 3 softening lengths or inter-ray spacings. The con- 
vergence field is constructed by binning the smoothing kernel over 
the HEALPix map for the SHT. A similar procedure is employed 
for the MG patches. I enforce exact mass conservation during this 
operation. 

The partial derivatives of the lensing potential needed to prop- 
agate the rays are computed from the finest level of the MG 
patch using fourth order finite difference stencils. Special asym- 
metric stencils are used at the edge of the finest level patch (see 
Ipornberd 19981 for simple algorithms to generate these stencils) and 
bilinear interpolation is used to get the derivatives at the ray loca- 
tions. One must be careful to account for the non-zero Christoffel 
symbols for the metric on the sphere when computing these deriva- 
tives (see AppendixlAt. Once the deflection angle and second-order 
derivatives of the lensing potential are computed at the ray loca- 
tions, they are rotated back into the original coordinate system be- 
fore the rays are propagated to the next lens plane. 

Finally, I have also implemented a pure SHT algorithm which 
synthesizes the needed derivatives of the lensing potential directly 
from its spherical harmonic coefficients. Pure SHT computations 
are slower in the high-resolution limit, but serve as good com- 
parisons to the faster, but ultimately noisier SHT-l-MG computa- 
tions described above. Also, for low-resolution shear fields, these 
computations are of comparable speed to the SHT-l-MG computa- 
tions presented here and are thus very useful. Typically one must 
use rays on HEALPix grids with NSIDE = 8192 or higher for the 
SHT-l-MG Poisson solver to be computationally more efficient than 
a pure HEALPix grid. For a DES-like survey with galaxy clusters 
at redshift 0.8, one will need ~ 20 arcsec effective resolution in 
the shear field to resolve the shear profiles at « 100 /i^^kpc. This 
will require calculations with NSIDE = 16384 or higher, depend- 
ing on the numerical implementation. In this regime the SHT-l-MG 
Poisson solver will be significantly more efficient then a pure SHT 
method. 
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Figure 2. Point mass tests of the SHT+MG Poisson solver ran in typical configurations. The left set of panels show the fractional error in the deflection angle 
a and the right set of panels show the fractional error in the tangential shear. Each panel coiTesponds to a different configuration of the SHT+MG Poisson 
solver, specified by the HEALPix order used for the SHT step (denoted as SHT), bundle cells (denoted as Bundle) and and the MG convergence parameter t. 
The SHT+MG Poisson solver is unbiased at high precision (< 0.1%), while it has typical RMS errors of 1-2% and has maximum absolute errors of no more 
than 5% or better. 



3.3 Implementing the Grid Search 

In order to implement the grid search for the galaxy images, I use 
HEALPix pixel centers to construct a set of triangles which cover 
the sphere at the image plane. This construction is illustrated in the 
right panel of Figure [T] I then use a fast tree code implemented in 
HEALPix to search for all rays within a small radius of the source 
galaxy at the source plane. This radius must be large enough to not 
miss any galaxy images, but small enough to be computationally 
efficient. Typically, radii of a few arcmin meet these requirements. 
Any triangle which has one of its vertices found near the source 
galaxy is tested to see if it contains the source galaxy at the source 
plane. The test for whether or not the triangle contains the galaxy 
and the linear interpolation for the galaxy image position are im- 
plemented ijsingbarycenter_£^ in §21.3 of 
IPress et aP ( l2007l : see also lLanger et aljr2006l) . The bary center co- 
ordinates require the triangle to be contained in a plane as opposed 
to the surface of the sphere. Here I use the projection into the tan- 
gent plane to sphere computed at the galaxy source location to per- 
form these tests. The test for the galaxy being in the triangle is 
unaffected by this projection and for sufficiently small spacings be- 
tween adjacent rays galaxy images are still interpolated accurately. 
To compute the inverse magnification matrix at the galaxy's image 
location, I use a separate linear interpolation over four ray image 
locations near the galaxy in the image plane (chosen according the 
interpolation routine available in the public HEALPix package). 
The lensing Jacobains are parallel transported to the galaxy image 
position before being used in the linear interpolation specified by 
the HEALPix package. In general, when a pure E-mode shear field 
is interpolated to a new a set of points, extra B-mode power can 
be introduced into the shear field (Hilbert et al. 20()9i). This sec- 
ond interpolation combined with the parallel transport corrections 



limits this effect. Finally for distributed memory computations, a 
buffer region of rays is imported from nearby domains so that no 
images are missed. The width of this buffer region is set in prin- 
ciple by the maximum deflection for any photon from its observed 
position over its entire path. In practice because these computations 
rarely resolve strong deflections and are done at moderate redshifts, 
a buffer region of ~ 10 arcmin is more than sufficient. 

3.4 Working with Light Cones and Finite Patches of Sky 

The ray tracing algorithm is designed to work directly on N-body 
light cones so that it can be used easily with mock galaxy catalogs 
derived from the light cone data sets. These light cone data sets are 
several hundred GB to many TB in size. Before the ray tracing, the 
N-body light cone data is organized into a spatially indexed format 
using HEALPix and HDf£3, so that each processor can quickly 
find the data it needs. In order to handle light cones which only 
partially cover the sphere, I set the matter density in the light cone 
in the regions outside the domain of interest to the mean density of 
the universe. For the lensing equations presented above, rays which 
traverse a region of the universe with matter at its mean density 
will experience no lensing deflection. This procedure limits edge 
effects in the shear and deflection fields for light cones with partial 
sky coverage. In some applications it is of interest to track rays 
over only part of the sphere, but still account for the influence of 
matter in the light cone outside of the domain of interest. For the 
SHT+MG Poisson solver, this effect can be achieved by running 
the SHT step over the full sky, but running the MG step only over 
the domain of interest. 

14 www.hdfgroup.org/HDF5/ 
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Table 1. Flat ACDM Light Cone Simulations 



Simulation 


Light Cone Area 


Box Size 


Number of Particles 


Force Softening 


Carmen 


220 deg2 


1000 fe-^Mpc 


1120^ 


25 h-^kpc 


Lb2600 


all-sky 


2600 /i^^Mpc 


2048^ 


35 h~^hpc 



See Section|4]for details. 



4 BASIC CODE AND SCALING TESTS 

In this section I present basic tests of the algorithm and code cor- 
rectness, along with parallel scaling tests. Section|4jT]has tests us- 
ing point masses and various configurations of the SHT-l-MG Pois- 
son solver. These tests illustrate the basic correctness of the code 
and the dominant source of error in the Poisson solver - the bound- 
ary condition interpolation onto the MG patches. In Section l4!2l I 
test the accuracy and convergence of the grid search implementa- 
tion described above. Finally in Section 143] I present strong and 
weak parallel scaling tests of the code performance for the typical 
kinds of calculations the code is designed to perform. 

The tests presented in this section use two different N-body 
light cone simulations. The first is a 220 deg^ light cone (M. Busha 
& R. Wechsler 2012, private communication) constructed from 
one of the Carmen simulations produced as part of the LasDamas 
project (C. McBride et al., in preparationj^ The Carmen simula- 
tion was run with Gadget-2 dSpringel et alj 



200ll : ISpringell2005l) 



with s econd-order Lagrangian perturbation theory initial condi- 
tions jCrocce et alj 2006[) and initial pow er spectrum generated 
with CMBFast i Zaldarriaga & Seliakl2000l) . The second is a large- 
volume simulation run with L-Gadget2 (an optimized version of 
Gadget-2 for large, dark matter only simulation s), second-order La - 
grangian perturbation theory initial conditions l lCrocce e t al.ll2006h 
and i nitial power spectrum generated with CAMB i Lewis et alj 
I2OOOI) . This last simulation, denoted below as Lb2600, features an 
all-sky light cone generated on the fly. It was generated as part of 
the Blind Cosmology Challenge simulation project to be described 
in M. Busha et al. (in preparation). Table [T] lists the properties of 
these simulations, all flat ACDM models. 



4.1 Point Mass Tests 

The basic parameters which control the SHT-l-MG Poisson solver 
are the size of the SHT grid, the smallest cell size used in the MG 
step, the size of the bundle cells, and the degree of convergence 
required before the the MG iterations are terminated. I will de- 
note the size of the SHT grid by the HEALPix order, defined via 
2^ = NSIDE. For HEALPix, each increase in order by one de- 
creases the area of the cells by exactly a factor of four (all HEALPix 
cells at a given order are equal in area) and thus the average mean 
inter-spacing of the cell centers by approximately a factor of twoF^ 
In the following tests, the smallest MG cell size is kept fixed, while 
the SHT order is varied. Additionally, I vary the sizes of the bun- 
dle cells and the MG convergence parameter e. This convergence 
parameter is proportional to ratio of the residuals to the truncation 
errors on the finest grid, as described above in Section [T2l 

Fractional errors from the SHT-l-MG Poisson solver using 
point masses and varying these parameters are shown in Figure |2] 



Lai'ge Suite of Dark 
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for order k. 



The left panels show the fractional errors in the deflection angle and 
the right panels show the fractional errors in the tangential shear. 
Each panel uses a different configuration denoted by the order of 
the SHT step (12 or 13), the order of the bundle cells (6 or 7), 
and the convergence threshold e (10~^ or 10~^). The smaller the 
parameter e, the higher the degree of convergence of the MG al- 
gorithm. The exact choice of configuration is generally motivated 
by computational considerations and parallel scaling, as discussed 
below in Section l431 The configurations shown are typical for ray 
tracing calculations which cover a few hundred square degrees or 
more. 

Several trends are apparent from this figure. First, as the SHT 
order is increased, so that the size of the SHT cells approaches 
the size smallest MG cell, the fractional errors decrease. This de- 
crease happens because the boundary condition interpolation be- 
comes more accurate as the SHT and MG cell sizes become compa- 
rable. Second, as the sizes of the bundle cells are changed, the radial 
location from the point mass where the fractional errors are largest 
shifts accordingly. These "spiky" features in the residuals are from 
bundle cells where the point mass is near the boundary of the MG 
patches, so that the boundary condition interpolation is especially 
poor. Note that these errors are fixed spatially at the edges of each 
bundle cell, but that for a real mass distribution strong sources will 
be located randomly along the edges, so that these errors will tend 
to partially average out. These tests are not shown here, but if one 
increases the size of the MG patches for a fixed bundle cell size and 
smallest MG cell size, the boundary condition interpolation errors 
also decrease. This decrease occurs because the MG patch bound- 
aries are moved further from the rays located at the center of the 
MG patch. Finally one can see that when the MG step is run to 
higher convergence by using a smaller value of e, the fractional er- 
rors decrease as well, until they approach the truncation errors. 

From these tests it is clear that the dominant contribution to the 
truncation error is from the boundary condition interpolation. Over- 
all, the SHT-l-MG Poisson solver has typical RMS errors of 1-2%, 
while it is unbiased at high precision (< 0.1%) and has maximum 
absolute errors of no more than 5% or better. As I will show below, 
the residual errors in the SHT-l-MG Poisson solver result in extra 
B-mode power in the shear field, which is not present for the pure 
SHT Poisson solver. However, these B-mode residuals are well be- 
low the level of shape noise expected for upcoming surveys so that 
they are tolerable in the context of making synthetic galaxy shear 
catalogs. 



4.2 Tests of the Grid Searcli for Galaxy Images 

In this section I establish the accuracy and convergence of the grid 
search algorithm for sphere described above, in the weak lensing 
regime only. First, I use a series of point mass tests with ran- 
domly located background sources. For this test I use large smooth- 
ing lengths and compute the lensing properties of the smoothed 
mass distribution exactly (see Appendix [E]), making sure that the 
smoothed mass distribution produces no strongly lensed images. 
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Figure 3. Accuracy test of the grid search algorithm for finding weaJdy 
lensed images. Images of randomly placed background galaxies were found 
around a point mass smoothed with a 13.74 arcmin smoothing length. This 
mass distribution does not produce multiple images. The plot shows the er- 
ror in the reconstructed source locations as a function of the arc distance be- 
tween the smoothed point mass and the image location. The error is defined 
as the arc distance between the reconstructed source location and the true 
source location and is a measure of the eiTor in the lensed image locations. 
In the weak lensing regime, where the relationship between the image and 
source positions is one-to-one, the grid search has errors which are < 0.2 
arcsec. 



Additionally, I make sure that the distance between the source and 
the image positions is always less than the ray buffer region used 
for the grid search and also the size of the MG patches. Images due 
to mass distributions which exhibit lensing deflections larger than 
these scales will not be captured properly because the rays will ei- 
ther not have been imported properly from adjacent MG patches 
or the rays themselves may not have been propagated as accurately 
along the line of sight because they can be deflected towards the 
boundaries of the MG patches. Second, I place sources randomly in 
a cosmological volume and test for convergence of the weak lensed 
image positions as the resolution of the ray grid is changed. 

The results of a typical test with the smoothed point mass 
setup are show in Figure [3] This figure shows the arc distance be- 
tween the source and the smoothed point mass as a function of the 
image arc distance from the smoothed point mass. Multiple im- 
ages in this figure would happen when for single source distance, 
multiple image horizontal lines of the same source arc distance in- 
tersect the locus of sources and images for this mass distribution 
multiple times. Given a computed image from the grid search, I 
show the reconstructed source position, computed with the lensing 
equations, as the blue points. The red points show the arc distance 
between the reconstructed and true source positions. In the weak 
lensing regime where only a single image is produced, these errors 
are < 0.2 arcsec. Although they are not shown, the errors in the 
shear components for the weakly lensed images are similar to those 
shown in Figure[2] Finally, if the smoothing length is decreased the 
smoothed point mass distribution will produce strongly lensed im- 
ages. The errors in the reconstructed source positions in the strong 
lensing regime are typically much larger, ~ 10 arcsec or more, in- 
creasing with the strength of the lens (but they remain roughly fixed 
in fractional error at a few percent). 

Second, in order to test the convergence of the grid search al- 
gorithm implemented on the sphere, ~ 26 million random sources 



Figure 4. Convergence test of the grid search for galaxy images. The plot 
shows histograms of the difference between the image positions obtained 
for a set of random placed sources by the grid search at different ray grid res- 
olutions. The lines from top to bottom show the difference between the im- 
age positions obtained with rays on a HEALPix order 12, 13, 14, and 15 grid 
from those obtained with rays on a HEALPix order 16 grid. The grid search 
algorithm for the sphere described above with ray grids of HEALPix order 
14 or greater have image positions converged to < 1 arcsec for Carmen-like 
resolution light cones. 

were placed in the Carmen 220 deg^ light cone. Then the light cone 
was ray traced and the images of those sources computed. The ray 
grid resolution for the ray tracing was varied between HEALPix 
orders 12 to 16. Figure |4] shows the difference between the image 
positions for the same source computed at HEALPix order 16 and 
the other resolutions, HEALPix orders 12-16. All of these images 
are in the weak lensing regime, so that one can expect the intrinsic 
accuracy of these images to be < 1 arcsec. Thus the convergence of 
the image positions in Figure|4]is largely related to changes in how 
the mass distribution is smoothed, which is also controlled by the 
ray spacing at far enough distances from the observer, as opposed 
to the behavior of the grid search algorithm itself. Nevertheless, for 
ray grid HEALPix orders greater than 14, the image positions are 
converged to < 1 arcsec for the Carmen light cone. This light cone 
has spatial and mass resolution typical of large volume simulations 
used for synthetic galaxy shear catalogs so that these convergence 
results are representative. 



4.3 Strong and Weak Scaling Tests 

The ray tracing algorithm presented in this work with the SHT-l-MG 
Poisson solver allows for versatile code implementations that have 
good performance in a variety of regimes. These regimes are typ- 
ically characterized by the ideas of strong and weak scaling. In 
strong scaling, one measures how the running time of the computa- 
tional problem changes when the size of the computational problem 
remains fixed (e.g., the sky area to be ray traced and resolution of 
the ray field) and the total number of cores participating in the cal- 
culation is varied. Perfect parallel or strong scaling means that the 
running time of the problem scales inversely with the number of 
computational cores. This kind of scaling is very hard to achieve 
in practice. In weak scaling, one measures how the running time 
of the computational problem changes with the problem size per 
computational core fixed and the total problem size varying. In the 
case of this work, one would keep the sky area per core fixed at a 
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Figure 5. Strong scaling tests of the algorithm and code for the small-area, 
220 square degree Carmen light cone. The ray tracing was done out to red- 
shift 0.32. The different color lines indicate the running times of different 
parts of the algorithm. The hashed range shows the minimum and maxi- 
mum time for a given step over all of the participating cores. The dashed 
black line indicates perfect parallel scaling with the running time inversely 
proportional to the number of computational cores used. Note that for ray 
tracing calculations with much lai'ger areas, the running time is dominated 
by the MG step as opposed to the SHT step. 



Figure 6. Weak scaling tests of the algorithm and code. The 2600 /i~^Mpc 
box was ray traced out to a redshift of 0.27 using 8 cores per 214.9 deg^, 
varying the total number of cores between 8 and 768. This range covers 
214.9 dcg^ to 20630.4 dcg^ or half of the full sky. The black, red, blue 
and magenta lines indicate the running times for the total calculation, the 
SHT steps, the MG steps and the rest of the code. The hatched band shows 
the minimum and maximum time for each step over all of the cores doing 
the calculation. When the mnning time is dominated by the MG steps, the 
calculations exhibit good weak scaling. 



single resolution while varying the total sky area and thus the total 
number of cores. A code with good weak scaling will have constant 
or decreasing total running time in this kind of test. 

The implementation of the SHT-l-MG algorithm presented in 
this work exhibits good strong scaling up to ~ 300 cores for small 
area (~ 100 square degree) ray tracing calculations, as shown in 
Figure|5] This figure shows the total running time and that of vari- 
ous computational steps for a 220 square degree ray tracing calcu- 
lation through the Carmen light cone out to redshift 0.32. This cal- 
culation used rays laid out on order 15 HEALPix cells, HEALPix 
order 12 SHT steps, and HEALPix order 7 bundle cells. It was per- 
formed with the Midwa}[3 computing cluster using Intel Xeon E5- 
2670 2.6 GHz processors cormected via FDR-10 Infiniband. The 
number of cores was varied between 4 and 1024 while the ray trac- 
ing calculation problem size remained fixed. The hatched regions in 
this figure show the range spanned by the minimum and maximum 
time for each step over all of the cores participating in the calcu- 
lation. The dashed black line shows perfect parallel scaling. The 
solid black line shows the total running time, the blue hatched lines 
show the MG step, the red lines show the SHT step and the magenta 
lines measure the rest of the running time lost to other steps in the 
algorithm, communication, I/O, and load balancing. In this calcu- 
lation I limited the number of processor cores which write output 
simultaneously to limit the load on the local file system to at most 
64, which matches the scale at which the running time of this step 
stops scaling decently. As the number of cores in increased, com- 
munications required to distribute SHT grid cells to various cores 
and the time required to output data to disk increase dramatically. 

For small area calculations such as this one, the overall run- 
ning time is dominated by the SHT step. Thus the overall parallel 
scaling is limited by the parallel scaling of this step. Note that as 
the problem size (i.e. the area to be ray traced) is increased, the 
overall running time can quickly become dominated by the MG 

|http : //rcc ■ uchicago ■ edu/resources/midway_specs . html| 



step, which exhibits nearly perfect parallel scaling. Eventually, as 
the number of cores approaches the number of bundle cells (1629 in 
this case), the discreteness in the domain decomposition will limit 
the ability of the calculation to load balance the MG step properly 
and thus achieve good parallel scaling. This effect can be seen in 
the width of the hatched region for the MG step, which increases 
with the number of cores indicating poorer load balancing. The dis- 
creteness in the domain decomposition, plus the overall scaling of 
the SHT step, primarily motivate the configuration of the SHT-l-MG 
Poisson solver. By choosing smaller bundle cells, the calculation 
will be able to run on more cores while still load balancing prop- 
erly. If enough cores are available, it may be advantageous to use a 
finer SHT grid in order to decrease errors in the SHT-l-MG Poisson 
solver. This choice comes at the expense of the running time of the 
SHT step which increases by a factor of eight for each increase in 
HEALPix order. 

In order to test the weak scaling of the code and algorithm, I 
compute the shear field for the Lb2600 light cone out to a redshift 
of 0.27, using 8 cores per 214.9 deg^. The code was run in the same 
setup as the strong scaling tests using the same machine. The results 
for varying the number of cores between 8 and 768, and thus the 
area between 214.9 deg^ to 20630.4 deg^ (or half of the full-sky), 
are shown in Figure|6] The total running time, that of the SHT step, 
that of the MG step and the rest of the running time are shown in the 
black, red, blue and magenta lines respectively. Again the hatched 
regions show the minimum and maximum time over all processors. 
Once the running time of the calculation is dominated by the MG 
step, the code exhibits good weak scaling. These results indicate 
that a full calculation out z ~ 2 or so (approximately 5 times as 
many lens planes) with the same setup for ~ 10, 000 deg^ on the 
same machine should take ~ 10k CPU hours, or approximately a 
day on 384 cores. 
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Figure 7. The convergence, shear, and rotation power spectra for the pure SHT (left) and SHT+MG (right) Poisson solvers at 2 = 1.1. In the top panels, the 
black soHd lines are the convergence power spectrum, the red solids are the shear E-mode power spectrum, the blue solid Unes are the shear B-mode power 
spectrum and the magenta solid lines are the power spectrum of the rotation angle, w. The dashed black line with h atched band shows the no n-linear power 
spectr um prediction for the shear E-mode power using the revised HaloFit fitting formula of lTakahashi et al.l )2012l) with N-body shoot noise jVale & Whiig 
I2OO3I) and cosmic variance calculated assuming Gaussian statistics. The middle panels show with the red line the ratio of the E-mode shear power to the 
convergence power and with the black line the ratio of the non-linear power spectrum prediction (HaloFit) to convergence power. The bottom panels show the 
ratio of the shear B-mode power to the rotation power. Both the ratio of the non-linear power spectrum prediction to the convergence power spectrum and the 
ratio of the B-mode power to the rotation power have been smoothed in these panels. The deviations at £ > 500 are do to the resolution of the ray field used 
to measure the power spectra and smoothing applied to the mass distribution, not the underlying N-body simulation. 



5 THE CONVERGENCE, SHEAR, AND ROTATION 
POWER SPECTRA 

In this section I present the convergence, shear, and rotation power 
spectra for full-sky multiple-plane ray tracing calculations through 
N-body light cone simulations. The Lb2600 light cone was ray 
traced out to a comoving distance of 2600 /i^^Mpc using a 
HEALPix grid of rays with HEALPix order 12. Note that this light 
cone has replications in it, but they are treated self-consistently by 
the light cone generator (M. Busha et al., in preparation) and the 
ray tracing code0 The pure SHT Poisson solver was run with 
HEALPix order 12. The SHT-l-MG Poisson solver was run with 
HEALPix order 7 bundle cells and an HEALPix order 9 SHT grid. 
The resulting convergence, shear and rotation fields were then ex- 
tracted from the inverse magnification matrices at the final plane 
using Equation Finally, the convergence, E-, B- and rotation 
mode power spectra were extracted from the HEALPix maps using 
the public HEALPix package. Note that because the maps cover the 



The light cone generator of M. Busha et al. (in preparation) places a 
single observer in the periodic lattice formed by the computational volume 
and its replications. From this single observer, a light cone is built as the 
simulation is running by looking for particles which cross the light cone 
surface in all directions in a sufficient set of replications around the observer 
to capture all matter out to some desired redshift. Thus all of the boundaries 
are continuous and the replications are treated self-consistently by both the 
N-body code and the ray tracing code. 



entire sk y, there is no ambi guity in the E/B-mode decomposition of 
the map ( iBunn et alj2003h . 

The results from this simulation are shown in Figure |7] The 
left panel shows the power spectra from the simulation using a pure 
SHT Poisson solver whereas the right panel shows the same power 
spectra using the SHT-l-MG Poisson solver. The top panels in this 
figure show the convergence, shear E-mode, shear B-mode, and ro- 
tation angle power spectra. Additionally, the prediction in the Lim- 
ber approximation for the convergence power spectrum is shown by 
the dashed line , computed with the rev ised HaloFit non-linear fit- 
ting formula of lTakahashi_etal. 1 2012h and the linear power spec- 
trum fitting formula of lEisenstein & Hul ( Il998h . I have added the 
contribution of N-body shot noise to this prediction as described in 
iVale & White (2003). The hatched region shows the range of ex- 
pected cosmic variance assuming Gaussian statistics, a dece nt ap- 
proximation in the linear regime (see, e.e.. ISato et alj|2009h . The 
middle panels show the ratio of the non-linear prediction and the 
E-mode shear power spectrum to the measured convergence power 
spectrum. The large deviations of the power spectra from the theo- 
retical predictions at ^ > 500 are due to the smoothing applied to 
the mass distribution and resolution of the ray fiel d, not the under- 
lying N -body simulation. I have confirmed that the lTakahashi et al] 
( 120 121) fitting f ormula reproduces the simulation results more accu- 
rately than the ISmith et al] ( l2003h formula, wh ich is low by a few 
percent at these scales as noted pre viously (e.g.. lHilbert et al .I2OO9I : 
lEifleil201 iklTakahashi et"ai]l2012h . 

As expected at second order in General Relativity, in both 
cases the shear E-mode and the convergence power spectra are 
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equal to high precision. For the SHT+MG Poisson solver there is a 
slight modulation of the shear E-mode power on the scale of a few 
degrees, presumably arising from the boundary condition interpola- 
tion for the MG patches. However this numerical artifact in ampli- 
tude is < 2%, so that it will be undetectable for a DES-like survey 
(a signal-to-noise of < 1) and only marginally so for a LSST-like 
survey (a signal-to-noise of < 2). These signal-to-noise calcula- 
tions were done using typical parameters for the lensing source 
density (12 and 40 galaxies per arcmin'^ for DES and LSST re- 
spectively) and survey area (1/8 and 1/2 of the sky for the DES 
and LSST respecti vely) assumin g Gaussian statistics at a variety of 
redshifts (see e.e.. iHuterer & Takada.2QQ5IV In both cases the B- 
mode and rotation mode power spectra are suppressed by nearly a 
factor of 10^, though this number is redshift dependent. These re- 
sults are qualitatively consistent with those already in the literature 
for the amplitude of the B-mode and rotation mode power spectra 



spe 



(e.g. Jjain et alj|2000l : lw"e & Whitell2003l : lHilbert et alj|2009 

For the first time, one can directly compare the amplitude of 
the B-mode and rotation mode power spectra from the ray tracing 
simulations. These are expected to be equal at sec ond order in the 
gravi tational p otential from perturba tion theory jffirata & SeliaS 
I200I see also lKrause & HiratallioToh . The bottom panels of Fig- 
ure |7] show the ratio of the shear B-mode power to the rotation 
power. While the pure SHT simulations are consistent with this 
result to a percent or less, the SHT-fMG simulations have a sig- 
nificant amount of extra B-mode power in the shear field. This ex- 
tra numerical B-mode power is roughly fixed in amplitude while 
the true B-mode and rotation power increase with redshift. Thus at 
lower redshifts, the B-mode power can be a factor of ~ 10 above 
the rotation mode power. As discussed above, this extra power ar- 
rises from the boundary condition interpolation. Note however that 
this extra power is well below the expected level of shape noise 
for upcoming weak lensing surveys so that it is undetectable when 
shape noise is included (a signal-to-noise <^ 1 for both the DES 
and LSST). Thus, while the SHT+MG Poisson solver does not al- 
low the ray tracing calculations to work completely correctly at 
second order, for future surveys in the context of making synthetic 
catalogs, these errors are acceptable. Finally, note that given the 
sensitivity of the amplitude of the B-mode power to the numerical 
details of the simulations (see also lHilbert et alj2009h . I have made 
no attempt to compare the B-mode power spectrum amplitude in 
the simulations to that expected from perturbation theory as done 
in lHilbertetal.c200Si) . 



6 CONCLUSIONS 

In this work I have presented CALCLENS, a new curved-sky ray 
tracing algorithm and code appropriate for current and future large- 
area sky surveys. The key feature of this code is the new SHT+MG 
Poisson solver for the sphere. While this Poisson solver has more 
noise than pure SHT Poisson solvers, it enables large areas to be 
ray traced quickly on a curved sky at high resolution. Additionally, 
this extra noise is well below the expected level of shape noise for 
current and upcoming surveys. Thus this new code is ideally suited 
for the generation of full-sky synthetic galaxy shear catalogs for 
verifying the accuracy of cosmological constraints from upcoming 
surveys. These synthetic galaxy shear catalogs can also be used as 
inputs to image simulations in order to test the process of produc- 
ing cosmological constraints with weak lensing completely from 
images to error bars. 

I have also investigated second-order lensing effects using 



full-sky E/B-mode decompositions which are c omplete and free 
of ambiguous modes in the shear field jBunn et al. 200^. Us- 
ing slower, but more accurate pure SHT calculations I have ver- 
ified that the B-mode shear and rotation mode power are equal 
to high precision (< 1%), as predicted from perturbation theory 
l lHirata & Seliakl2003h . I find that the updated Hal oFit prescription 
for th e non-linear power spectrum presented by iTakahashi et alj 
( l2012h used in the Limber approximation match the amplitude of 
the convergence and shear power spectr a from ray tracing to a few 
percent, whereas the lSmith et al.l ( l2003h prescription does not. 

The calculations and methods presented in this work can be 
extended in several ways. Upcoming, high-sensitivity and high- 
angular resolution observations of CMB temperature and polariza- 
tion signals will allow for high-signifi cance CMB l e nsing measure- 
ment s (e.g.. iGuzik et alJ I2O OO: Smi th et alj 20061: Ide Putter et alj 
I2OO9I : iMcMahon et alj l2009l : iNiemack et alj|2010l) . These CMB 



lensing maps can be cross-correlated with observations of galaxies 
from surveys to extract c onstraints on galaxy bias and other cosmo- 
logical parameters (e.g. . Ide Putter et al]l2009l : [sherwin et al.ll20l3 : 
iBleemetal. 20l3). Thus producing lensed CMB temperature and 
polarization maps for the same underlying large scale structure in 
which the galaxies have been placed self-consistently will be im- 
portant for testing and understanding these methods. In fact such 
simulations for the 220 dcg^ Carmen light cone have already been 
made and used in Bleem et al. (2012) with data from the SPtF^ to 
study the galaxy bias of galaxies detected in the Blanco Cosmology 
Survey, similar to studies to be performed with the combination of 
data from the SPT and the DES. Lense d CMB signals may be usefu l 
to study cluster mass profiles as well jSeliak & Zaldarriagall2000l) . 

Current and future large-area sky surveys will present an un- 
precedented data analysis challenge for WL studies, both in terms 
of the measurement of shear signals from pixelized images and 
also in terms of the modeling required to reliably extract cosmo- 
logical information from these measurements. In this work I have 
presented a new algorithm and code capable of producing full-sky 
weak lensing shear simulations, suitable for testing data analysis 
procedures and verifying cosmological constraints from these sur- 
veys are free of systematic errors. Combined with methods that 
place galaxies self-consistently into cosmolog ical light cone sim- 
ulations (e.g., GALACTICUS. iBensonI I2OO or ADDGALS, R. 
Wechsler et al., in preparation; M. Busha et al., in preparation), 
this code can be used to make synthetic galaxy shear catalogs for 
use with a variety of current and future surveys. As the amount and 
quality of WL data increases, I expect these shear catalogs, and 
the algorithms needed to produce them, to be increasingly useful 
and necessary in order to realize the scientific potential of the next 
generation of large-area sky surveys. 
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APPENDIX A: WEAK GRAVITATIONAL LENSING BY 
POINT MASSES ON THE SPHERE 

In this Appendix I review differential geometry on the sphere and 
present the solution for the shear and deflection angles due to a 
point mass on the sphere. On the sphere the metric is 

ds^ — dO^ + sin^ 6 d(t>^ 

and the non-zero Christoffel symbols are 



second derivatives of a scalar function 'I' are 



- ^0 



— sm ti cos f 
cos 9 



"'O sine' 
Given the metric I can also define the following basis vectors 

eg = 9 



sin 9 (f) 



e 



where 9 and (p are the standard unit basis vectors on the sphere and 
the other quantities defined above satisfy the following relations 
from differential geometry 

gab = Ba ■ Gf, 

ab a b 

g = e ■ e 

where {a, b} index the 9- and ^-components. For future reference 
the gradient of a scalar and the covariant derivative of a vector field 
on the sphere in this notation read 

VV' = (aaV)e° 

V(,Ua = dbVa — Fat^c 

where v is a vector field with components v — vge'^ + u^e"^. For 
completeness the expressions in component form for the first and 



1 



sm f 

a|* 9® 9 

1 _2 ^ COS 9 . ^ 7 

sm^ 8 sm9 
1 „ „ , cos 9 „ ^ 



sm y ' sin^ 9 ' 
In these expressions all derivatives in the 0-direction have been 
scaled by 1/ sin 9 to account for the non-unit length of e*^. Finally, 
for the 1-axis along the ^-direction and the 2-axis along the (f>- 
direction, the magnitudes of the deflection and shear components 
in terms of the derivatives given above are 



Clg 



71 



1 



(V<;,V^* - VflVfl*) 



72 = V^Vs* 

The Poisson equation for a point particle at the top of the 
sphere reads 

where is the Laplacian operator which for my choice of coordi- 



nates IS 



d 



+ 



I have subtracted the mean density of the point particle on the 
sphere from the overall density in the Poisson equation above, an- 
ticipating the result that 1 = modes (i.e. constant modes) of the 
density field can be set to zero when solving for the potential. The 
solution to the Poisson equation above is 

*(«» = ^log[l-cose] . 

This equation has no (j) dependence because I have placed the point 
particle at the top of the sphere. 

Expressions for the deflection angle and shear due to a point 
mass can be obtained from the expressions for the gradient and the 
covariant derivative of a vector of the sphere given above. Since I 
will store the shear in an orthonormal basis aligned with (9, 0) on 
the sphere, the 4> ~ 4> component of VaVi,'I'(6', <j>) must be scaled 
by 1/ sin^ 9 in order to account for the non-unit length of e'*. The 
deflection angle and shear components for a point mass at the top 
of the sphere are 

1 sin 6* 



ag = 



71 



47r 1 — cos 9 







11 + cos t 



Stt 1 — cos 9 
72 = 

The re sults for the shear can be obtained from lde Putter & Takadal 
as well. Once the deflection angle and the shear for the 
point particle are known when the point particle is at the top of the 
sphere, it is straight forward to generalize the expressions for any 
two points on the sphere. Given the point particle's position and 
the position under consideration, one simply computes the "paral- 
lactic" angle between the great circle connecting the point particle 
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and the position under consideration and the +Q axis of the local 
basis in which the shear is stored at each location on the sphere 
(i.e. 6). This angle uj can then be used to rotate both the shear and 
deflection angle into the local basis as needed. 



APPENDIX B: THE PARALLEL TRANSPORT OF 
TENSORS ON THE SPHERE ALONG GEODESICS 

In this section I discuss the parallel transport of tensors on the 
sp here. The discussion in this section follows closely the notation 
of lHobson et"ai] ( l2006h . In general, the parallel transport of a tensor 
t"*' along a curve x'^{u) parametrized by an affine parameter u is 
given by the solution of the following equation 



„ De" dt"" dbdx" „b addx" 

= -jr- = -3- + r^ct -3- + rLi — 

Uu du du du 



(Bl) 



where the F^^ are the Christoffel symbols and the notation D/Du 
denotes the intrinsic derivative along the curve x'^. The ray propa- 
gation algorithm defined above needs the parallel transport of the 
inverse magnification matrix (a tensor on the sphere) along great 
circle arcs (geodesies of the sphere) which connect each ray with 
its location on the next lens plane in angle. 

A simplified prescription for this operation can be defined as 
follows. Consider the geodesic along the "equator" of the sphere at 
9 = 7r/2. Along this path I can write x(it) = (7i"/2, u) where I 
now have used the arc-length as u. This parameter is affine for this 
geodesic since the tangent vector to the geodesic, e**, is the same 
at all points along the geodesic. I have neglected an over all shift 
in "phase" around the sphere so that the parallel transport starts at 
u = Ui = Q and ends at some u = Uf. With this definition of the 
geodesic x"^, one sees that all of the Christoffel symbols from Ap- 
pendix |A] vanish identically. Thus parallel transport for any tensor 
along this geodesic is trivial; its components remain constant. For 
this geodesic also note that the tensor components are defined in 
a basis in which one of the basis vectors which is always aligned 
with the tangent vector of the geodesic. 

Using this construction, it is straight forward to generalize 
the parallel transport of any tensor between any two points on the 
sphere connected by a geodesic. One simply first rotates the ten- 
sor components in the original coordinate system into a coordinate 
system where the geodesic is along the "equator" of the sphere. 
Then the tensor is parallel transported along the geodesic in this 
new coordinate system, where its components remain unchanged 
during this operation. Then at the final location of the tensor the 
components are rotated back into the old coordinate system, being 
careful to note that the tensor basis at the final point in the old co- 
ordinate system differs from the tensor basis at the initial point in 
th e old coordinate system. T his procedure is exactly that described 
bv lChallinor & ChonI ( |2003) . but the previous discussion has estab- 
lished that it applies not only to traceless, symmetric polarization 
tensors, but to all tensors defined on the surface of the sphere which 
are parallel transported along geodesies. 

In practice I use the following equivalent prescription for par- 
allel transporting tensors along geodesies on the sphere. I simply 
perform a three-dimensional rotation of the local tensor basis 3- 
veetors about the center of the sphere. The axis and angle of this 
rotation is fixed by requiring them to trace the geodesic path used 
for parallel transporting. Then using the rotated tensor basis vectors 
and those at the final location of the sphere, I compute the angle by 
which to rotate the tensor components. Finally using this angle, I 
express the tensor components in this final basis. Note that this pro- 



cedure applies just as well to vectors on the surface of the sphere 
by the same argument as given above. 



APPENDIX C: EQUIVALENCE OF THE RECURRENCE 
RELATIONS FOR THE INVERSE MAGNIFICATION 
MATRIX 

The recurrence relation for the inverse magnification matrix ob- 
tained directly from the discretization of the lensing equations (see 
Equation|7j 



I 



E Djk 



(Cl) 



where is the inverse magnification matrix for the rays at the fc-th 
lensing plane, JJi is the matrix of second partial derivatives of the 
lensing potential at the i-th plane, Dk = r{xk), and Dik = r{xk — 
xO- I have switched to matrix notation here with bold symbols 
denoting matrices. With this definition of Dik one can verify the 
identity 



Dac = 



Dt 



(C2) 



directly from the properties of the sine and hyperbolic sine func- 
tions. Additionally in this notation Dac ~ —Dca- 

With this identity in hand, I can now verify that Equa- 
tion il2\ is equivalent to Equation l lClb using induction (see 
ISeitz&SchneideJll992l for a similar proof). Once can verify di- 
rectly that up to = 2 the two relations are the same. Now assume 
they are equivalent up to fc — 1. Then for Afc I get by substituting 
Equation dCll l into Equation l ll2b 



1 - 
Dk-i 



k-i 



D,, 



k-2.k 



Dk Dk~2,k-1 
Dk-2,k . 



Dk Dk~2,k-1 

Dk-i,k 



fc-3 



Dk 
Dk-i 



Vk-iAk-i 

Dk-2,k 



Ak-2 

_ Dk-l,k 

Dk-2,k 
A~ 

Di±-2 



Dk Dk~2.k-1 
Di,k-2 



Dk 



-Ufe_iAfc_i 

Ufc_.2Afc_2 

Dk-1 



Dk 



U,A, 



(C3) 



By applying the identity in Equation ( IC2b twice to the expression 
in the brackets, I get 



Dk-l Dk-2,k j 


Di,k- 


Dk Dk-2,k-l \ 


Dk- 


Dk-1 Dk 


-2,k 


Dk Dk- 


2,fc-l 


Di^k-2 




Dk-2 





A. 



Dk-1 J Dk-2 

Dk-iDi^k-2 + Dk-i,iDk-2 
A-iA-2 



Dk 



kDi Di, 



DkDk-2 Dk-2 
DiDk-2,k + Di h-2Dk 



DkDk-2 



Dik 
' Dk 



Thus by induction Equations il2l and JClt are equivalent. 
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APPENDIX D: RADIAL SHEAR INTERPOLATION FOR 
GALAXY IMAGES 

In this appendix I verify that the radial shear interpolation for find- 
ing galaxy images presented in Section [23l is equivalent to properly 
computing the inverse magnification matrix for the galaxy from the 
previous two lens planes. Assume a galaxy with comoving distance 
Xg falls in lens plane n + 1 so that Xn+1-1/2 < Xg < Xn+i+i/2- 
Then the correct equations to compute the inverse magnification 
matrix at the galaxy's radial location is (cf. EQuation ll2b 



As = 



1 - 



D„ D 



Dg IDn—l^n 



D 



1—1,9 



Dg Dn — ljiz 



-IT A 



(Dl) 



where I have now switched to the notation of Appendix[C] For prac- 
tical reasons, CALCLENS actually implements the radial interpo- 
lation using the following equation 



+ - 



Dg D 



n,n-^ 1 



(D2) 



This last equation matches directly the procedure described in Sec- 
tion l2.3l By direct substitution of Equation l ll2b into Equation l lD2b . 
one can verify that the relation given in Equation dPlb is in fact 
equivalent to that in Equation iD2\ . This derivation requires the 
identity given in Equation iC2\ and also the following identity 

DacDbd — DabDcd 



Dad = 



Dt 



(D3) 



which again can be verified directly from the properties of the sine 
and hyperbolic sine functions. 



this work by a factor of two (i.e., «: — > fi:/2 in Equations (Ell and 
l lE4t given above). Second, one expects that at scales greater than 
smoothing length of the kernel, the solution for the deflection angle 
and shear should equal tha t of a point mass derived ab ove. In order 
for the formula derived bv lde Putter & Takadal ( I2OI0I) to reproduce 
the expected large scale limit, one must subtract the mean density of 
the kernel on the sphere, l/47r. Finally, the Epanechnikov k e rnel is 
symmetric in (f> so that the results from lde Putter & Takadal ( I2OI0I) 
can be used to compute quantities which are not angle averaged. 
Thus the equation being solved is 



(E5) 



With these details in mind, the deflection angle and shear com- 
ponents for an Epanechnikov kernel placed at the top of the sphere 



as — 

Ctij, = 

71 = 

72 = 



-h{d; cr) 



< a 



--2 {^T^oK0; o 

_ 1 i+coB e 



where h{9; a) is 
h(e; a) 



A/'((t)(1 - cos 6') 

(-2sinc(6l/7r) + cos 6* + sinc(e/2/7r)^ 



— cos i 



J_ 

An 



(E6) 
(E7) 

<a (E8) 

(E9) 

(ElO) 



(Ell) 



APPENDIX E: THE DEFLECTION ANGLE AND SHEAR 
FOR THE EPANECHNIKOV KERNEL ON THE SPHERE 



I use the results of Ide Putter & Takad to compute 

the deflection ang l e and shear of the Epanechnikov kernel. 
Ide Putter & Takadd ( 1201 Ol) derived the full-sky relationship be- 
tween the angle-averaged convergence and the angle-averaged tan- 
gential shear. 



(El) 



1 + cos i 

where one is averaging around a point at the top of the sphere and 
^«9) = 7^-. —i / d(j)de' sine' K{e' ,(l>) (E2) 



2tv{1 — cos I 

2tt 







d(t)K{e',(t>) 



(E3) 



In order to c ompute the deflection angle , I use the following rela- 
tionship from lde Putter & Takadj l l2010h 



sin 9 



2(1 - cos^ 



(E4) 



where is the angle-averaged partial derivative of the lensing 

potential ^, similar to Equation iE3l . 

There are however a few import ant details to consider be- 
fore completing the calculation. First. Ide Putter & Taka da ( 20101) 
use the standard definition of the convergence which differs from 
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